The Genetic Interpretation of Area under the ROC Curve in Genomic Profiling
Abstract
Genome-wide association studies in human populations have facilitated the creation of genomic profiles, which combine the effects of many associated genetic variants to predict risk of disease. The area under the receiver operating characteristic (ROC) curve is a well-established measure of how well a test classifies diseased and non-diseased individuals. We use quantitative genetics theory to provide insight into the genetic interpretation of the area under the ROC curve (AUC) when the test classifier is a predictor of genetic risk. Even when the test explains 100% of the genetic variance, AUC has a maximum value that depends on the genetic epidemiology of the disease, i.e. either the sibling recurrence risk or the heritability, together with disease prevalence. We derive an equation relating maximum AUC to heritability and disease prevalence. The expression can be inverted to calculate the proportion of genetic variance explained, given AUC, disease prevalence, and heritability. We use published estimates of disease prevalence and sibling recurrence risk for 17 complex genetic diseases to calculate the proportion of genetic variance that a test must explain to achieve AUC = 0.75; this proportion varied from 0.10 to 0.74. We provide a genetic interpretation of AUC for use with predictors of genetic risk based on genomic profiles. We also provide a strategy, available as an online calculator, to estimate the proportion of genetic variance explained on the liability scale from estimates of AUC, disease prevalence, and heritability (or sibling recurrence risk).
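The abstract does not reproduce the equation itself, so the following is only a minimal sketch of the kind of calculation it describes, under assumptions that are ours: a liability threshold model with standard normal liability, and a normal approximation to the distribution of the genetic predictor in cases and in controls. The function names (`auc_from_variance`, `max_auc`, `proportion_explained`) are hypothetical and are not taken from the paper or its online calculator. Here `K` is disease prevalence, `h2_L` is heritability on the liability scale, and `h2_x` is the proportion of liability variance explained by the predictor.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for liability


def auc_from_variance(h2_x: float, K: float) -> float:
    """Approximate AUC of a genetic predictor that explains h2_x of the
    variance in liability, for a disease with prevalence K.

    Assumption (not from the abstract): the predictor is treated as
    normally distributed within cases and within controls.
    """
    T = _STD_NORMAL.inv_cdf(1.0 - K)   # liability threshold
    z = _STD_NORMAL.pdf(T)             # normal density at the threshold
    i = z / K                          # mean liability of cases
    v = -i * K / (1.0 - K)             # mean liability of controls
    mean_diff = (i - v) * h2_x         # case mean minus control mean of predictor
    var_cases = h2_x * (1.0 - h2_x * i * (i - T))
    var_controls = h2_x * (1.0 - h2_x * v * (v - T))
    return _STD_NORMAL.cdf(mean_diff / (var_cases + var_controls) ** 0.5)


def max_auc(h2_L: float, K: float) -> float:
    """AUC when the predictor explains all of the genetic variance."""
    return auc_from_variance(h2_L, K)


def proportion_explained(auc: float, h2_L: float, K: float,
                         tol: float = 1e-10) -> float:
    """Invert the relation numerically (bisection) to get the proportion
    of genetic variance a predictor must explain to reach a given AUC."""
    if auc <= 0.5:
        return 0.0
    if auc >= max_auc(h2_L, K):
        return 1.0  # target is at or above the ceiling for this disease
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if auc_from_variance(mid * h2_L, K) < auc:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

For example, `proportion_explained(0.75, h2_L, K)` would give the share of genetic variance needed to reach AUC = 0.75 under these assumptions, for whatever `h2_L` and `K` are supplied. Bisection is used here only because it needs no derivative and the approximate AUC increases monotonically with the variance explained; the published calculator may work differently.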
Similar resources
Receiver Operating Characteristic (ROC) Curve Analysis for Medical Diagnostic Test Evaluation
This review provides the basic principles and rationale for ROC analysis of rating and continuous diagnostic test results versus a gold standard. Derived indexes of accuracy, in particular the area under the curve (AUC), have a meaningful interpretation for distinguishing diseased from healthy subjects. The methods of estimating AUC and testing it in single diagnostic tests and also comparative studies...
Karstic water exploration using the Schlumberger VES and dipole–dipole resistivity profiling surveys in the Tepal area, west of Shahrood, Iran
Clean groundwater resources are essential for a country's sustainable development. Because karstic waters are important and of high quality as a water supply in Iran, especially in Shahrood city, this research attempts to identify and explore karstic waters in the southwest of the Tepal area, Shahrood. For this purpose, integration of the results obtained from the metho...
Effect of ghrelin on serum metabolites in Alzheimer’s disease model rats; a metabolomics studies based on 1H-NMR technique
Objective(s): Alzheimer’s disease (AD) is a neurodegenerative disease involving dysfunction of the central nervous system. The objective of this work is to investigate metabolic profiling in the serum of animal models of AD compared to healthy controls and then to examine the role of ghrelin as a therapeutic approach for AD. Materials and Methods: Nuclear magnetic resonance (NMR) technique was u...
Evaluation of Salivary Level of Heat Shock Protein 70 in Patients with Breast Cancer
Introduction: Breast cancer is the most common cancer diagnosed among women worldwide. Increased molecular and genetic information about cancer has improved diagnostic, screening, and treatment methods for cancer. Heat shock protein 70 (HSP70) is overexpressed in breast cancer patients and involved in malignant properties of breast cancer. Due to the noninvasive nature of saliva collection and ...
A Comment on the ROC Curve and the Area Under it as Performance Measures
The Receiver Operating Characteristic (ROC) curve is a two dimensional measure of classification performance. The area under the ROC curve (AUC) is a scalar measure gauging one facet of performance. In this note, five idealized models are utilized to relate the shape of the ROC curve, and the area under it, to features of the underlying distribution of forecasts. This allows for an interpretati...